    • A scalable decision-tree-based method to explain interactions in dyadic data

      Eiras-Franco, Carlos; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Bahamonde, Antonio (Elsevier, 2019-12)
      [Abstract]: Gaining relevant insight from a dyadic dataset, which describes interactions between two entities, is an open problem that has sparked the interest of researchers and industry data scientists alike. However, ...
    • Fast Distributed kNN Graph Construction Using Auto-tuned Locality-sensitive Hashing 

      Eiras-Franco, Carlos; Martínez Rego, David; Kanthan, Leslie; Piñeiro, César; Bahamonde, Antonio; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo (Association for Computing Machinery, 2020)
      [Abstract]: The k-nearest-neighbors (kNN) graph is a popular and powerful data structure that is used in various areas of Data Science, but the high computational cost of obtaining it hinders its use on large datasets. ...
    • Interpretable market segmentation on high dimension data 

      Eiras-Franco, Carlos; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Bahamonde, Antonio (MDPI AG, 2018-09-17)
      [Abstract]: Obtaining relevant information from the vast amount of data generated by interactions in a market or, in general, from a dyadic dataset, is a broad problem of great interest both for industry and academia. Also, ...
    • Large scale anomaly detection in mixed numerical and categorical input spaces 

      Eiras-Franco, Carlos; Martínez Rego, David; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Bahamonde, Antonio (Elsevier, 2019)
      [Abstract]: This work presents the ADMNC method, designed to tackle anomaly detection for large-scale problems with a mixture of categorical and numerical input variables. A flexible parametric probability measure is ...
    • Scalable Feature Selection Using ReliefF Aided by Locality-Sensitive Hashing 

      Eiras-Franco, Carlos; Guijarro-Berdiñas, Bertha; Alonso-Betanzos, Amparo; Bahamonde, Antonio (Wiley, 2021)
      [Abstract]: Feature selection algorithms, such as ReliefF, are very important for processing high-dimensionality data sets. However, widespread use of popular and effective such algorithms is limited by their computational ...